Abstract

Assessing student performance is one of the most critical aspects of the job of a 
classroom teacher; however, many teachers do not feel adequately prepared to assess 
their students’ performance. In order to measure and compare preservice and inservice 
teachers’ “assessment literacy,” both groups were surveyed using the Classroom 
Assessment Literacy Inventory (CALI), which was designed to parallel the Standards for
Teacher Competence in the Educational Assessment of Students. Inservice teachers 
performed highest on Standard 3 — Administering, Scoring, and Interpreting the Results 
of Assessments and lowest on Standard 5— Developing Valid Grading Procedures. 
Preservice teachers performed highest on Standard 1 — Choosing Appropriate Assessment 
Methods and lowest on Standard 5— Developing Valid Grading Procedures. 

Comparisons between the two groups revealed significant differences on five of the seven 
competency areas, as well as on the total scores. In all cases where significant differences 
were found, the inservice teachers scored higher than their preservice counterparts. 

Preservice Versus Inservice Teachers’ Assessment Literacy
Does Classroom Experience Make a Difference? 

Introduction 

Assessing student performance is one of the most critical aspects of the job of a 
classroom teacher. It impacts nearly everything that teachers do. For example, aspects of 
a teacher’s job that are impacted by assessment include, but are not limited to, the 
following: 

• guiding decisions about large-group instruction; 

• developing individualized instructional programs; 

• determining the extent to which instructional objectives have been met; 

• providing information for administrative decisions, such as promotion, retention, 
or graduation; and 

• providing data for state or federal programs. 

With respect to classroom assessment, there exists a paradox in our educational 
system. Accurate assessment of achievement is being more urgently called for at the 
district, state, and national levels (Rogers, 1991). Various reform efforts are forcing 
teachers to be held accountable for their assessment of student learning. However, 
teachers do not feel adequately prepared to meet this challenge. Classroom teachers are 
calling for more training due to their perceived lack of preparedness to assess their 
students, citing weaknesses in their undergraduate preparation programs (Rogers, 1991). 

Review of Related Literature

Discussed in this review are specific works related to (1) general research on 
classroom assessment, (2) The Standards for Teacher Competence in the Educational 
Assessment of Students, (3) assessment literacy, and (4) specific research studies that 
have been conducted on The Standards and assessment literacy. 

Research on Classroom Assessment 

It has been estimated that teachers spend up to 50 percent of their time on 
assessment-related activities (Plake, 1993). Regardless of the amount of time spent on it, 
classroom assessment is a vitally important teaching function; it contributes to every 
other teacher function (Brookhart, 1998, 1999b). Assessment is used for numerous 
purposes: to diagnose student needs, to group students, to grade students, to evaluate 
instruction, to motivate students, etc. (Stiggins, 1999a). Sound assessment and grading 
practices help teachers to improve their instruction, improve students' motivation to learn, 
and increase students' levels of achievement (Brookhart, 1999a). According to Stiggins 
(1999a), “The quality of instruction in any ... classroom turns on the quality of the
assessments used there” (p. 20). For all of these reasons, the information resulting from
classroom assessments must be meaningful and accurate; i.e., the information must be 
valid and reliable (Brookhart, 1999a). 

In recent years, public and governmental attention has shifted to school 
achievement as evidenced by performance on standardized achievement tests (Campbell, 
Murphy, & Holt, 2002). Additionally, there has been an increase in expectations 
regarding teachers' assessment expertise. Teachers have been required to develop 
classroom assessments that align curriculum with state standards as a means of improving 
test scores (Campbell, Murphy, & Holt, 2002). New research on the relationship between
classroom assessments and student performance on standardized tests reveals that 
improving the quality of classroom assessments can increase average scores on large- 
scale assessments as much as 3/4 of a SD (as much as 4 grade equivalents or 15-20 
percentile points), representing a huge potential (Stiggins, 1999a). This is important 
research since it makes a connection between the quality of assessment in the classroom 
and assessment resulting from standardized testing programs. 

Ironically, in this age of increase in emphasis on testing and assessment, many 
colleges of education and state education agencies do not require preservice teachers to 
complete specific coursework in classroom assessment (Campbell, Murphy, & Holt,
2002; O’Sullivan & Johnson, 1993). This continues to be an interesting phenomenon
since many inservice teachers report that they are not well prepared to assess student 
learning (Plake, 1993). Furthermore, these teachers often claim that the lack of adequate 
preparation is largely due to inadequate preservice training in the area of educational 
measurement (Plake, 1993). Brookhart (2001) also cites literature that calls for an 
increase in emphasis in teacher preparation programs on classroom assessment and a 
decrease in emphasis on large-scale testing. Studies have generally concluded that 
teachers' skills in both areas are limited. 

Three methods have been used to investigate teachers’ assessment practices, as
well as their levels of preparation to assess students: surveys of attitudes, beliefs, and 
practices; tests of assessment knowledge; and reviews of teachers' actual assessments 
(Brookhart, 2001). Regardless of the method used, research has documented that 
teachers’ assessment skills are generally weak (Campbell, Murphy, & Holt, 2002). 
Stiggins (2001) agrees, stating that we are seeing unacceptably low levels
of assessment literacy among practicing teachers and administrators in our schools. He
continues by stating that this assessment illiteracy has resulted in inaccurate assessment
of students, causing them to fail to reach their full potential.

With respect to teachers’ assessment practices, for example, Mertler (1999) found 
that teachers did not perform statistical analyses of test data (e.g., estimating reliability, 
conducting item analyses) very often. Furthermore, teachers indicated that they followed
specific steps to ensure validity and reliability about half of the time or less (Mertler,
2000). When asked to list specific steps that they follow to ensure validity, teachers
offered a wide variety (N = 611) of responses. Only half of those responses
provided procedures that were appropriate (or at least approximate); about one-third were 
simply not appropriate (e.g., “I check reliability,” “I use statistical analyses,” etc.); less 
than 20% focused on content-related evidence of validity (which is most appropriate for 
teacher-made tests); numerous teachers provided "procedures" that were troubling, to say 
the least (e.g., “It can't be done,” “I don't have time,” “I don't know what validity even 
is,” “teachers don't have time for this,” and “You'll just figure out what works for you”). 

When asked to list specific steps that they follow to ensure reliability, teachers again
offered a wide variety (N = 431) of responses (Mertler, 2000). Only 10% indicated that
they used statistical analyses (the appropriate response); over half said that
teacher-made tests are automatically reliable, or provided other troubling
comments (e.g., “There are no specific steps,” “I have no time to do this,” “Is there really 
a difference between validity and reliability?,” and “Worrying about reliability is way 
down on list of priorities”). 

With respect to teachers’ levels of assessment preparation, for example, over 70%
of teachers responding to a national survey reported exposure to tests and measurement 
content (either through a course or inservice training), although for the majority it had 
been longer than 6 years. Those who had had previous coursework/training scored 
significantly higher on a test of assessment literacy than those who hadn't, but the 
difference was less than one point (Plake, 1993). 

When inservice teachers in a statewide study were asked about the level of 
preparedness to assess student learning resulting from their teacher preparation programs, 
the median response was “slightly prepared” (Mertler, 1999). When asked about their 
current level of preparedness, the median response improved to “somewhat prepared.” 
Mertler (1999) concluded that this potentially implies that teachers tend to develop 
assessment skills on the job, as opposed to structured environments such as courses or
workshops. Stiggins (1999a) has reiterated this implication, stating that many teachers are 
left unprepared to assess student learning as a result of both preservice and graduate 
training; they acquire what assessment “expertise” and skills they possess while on the 
job. 

Brookhart (2001) has quite accurately summarized the research on teachers’ 
assessment practices when she states that teachers apparently do better at classroom 
applications than at interpreting standardized tests (likely due to the nature of their work).
Additionally, they lack expertise at test construction, and they do not always use valid 
grading procedures. 

“The Standards for Teacher Competence in the Educational Assessment of Students”

The Standards for Teacher Competence in the Educational Assessment of Students 
(AFT, NCME, & NEA, 1990) were a joint effort between the American Federation of 
Teachers, the National Council on Measurement in Education, and the National 
Education Association. This joint effort began in 1987 in order to “develop standards for 
teacher competence in student assessment out of concern that the potential educational 
benefits of student assessments be fully realized” (AFT, NCME, & NEA, 1990). They 
were originally developed in order to address the problem of inadequate assessment 
training for teachers (AFT, NCME, & NEA, 1990). 

According to The Standards (AFT, NCME, & NEA, 1990), assessment is defined 
as “the process of obtaining information that is used to make educational decisions about 
students, to give feedback to the student about his or her progress, strengths, and
weaknesses, to judge instructional effectiveness and curricular adequacy, and to inform 
policy.” The Standards, of which there are seven, provide criteria for teacher competence 
with respect to the various components of this definition of assessment. The Standards 
for Teacher Competence in the Educational Assessment of Students consists of the 
following seven principles: 

1. Teachers should be skilled in choosing assessment methods appropriate for 
instructional decisions. 

2. Teachers should be skilled in developing assessment methods appropriate for 
instructional decisions. 

3. The teacher should be skilled in administering, scoring and interpreting the results 
of both externally produced and teacher-produced assessment methods. 

4. Teachers should be skilled in using assessment results when making decisions
about individual students, planning teaching, developing curriculum, and school
improvement.

5. Teachers should be skilled in developing valid pupil grading procedures that use 
pupil assessments. 

6. Teachers should be skilled in communicating assessment results to students, 
parents, other lay audiences, and other educators. 

7. Teachers should be skilled in recognizing unethical, illegal, and otherwise 
inappropriate assessment methods and uses of assessment information. 

The Standards acknowledge and specify the importance of teacher education and 
professional development in the area of classroom assessment (Brookhart, 2001). All 7 
standards apply to teachers' development and use of classroom assessments of 
instructional goals and objectives that form the basis for classroom instruction. Standards 3,
4, 6, and 7 also apply to large-scale assessment, including administering, interpreting, and
communicating assessment results, using information for decision making, and 
recognizing unethical practices (Brookhart, 2001). 

What is "Assessment Literacy"? 

Several times in this paper, the term “assessment literacy” has been mentioned. 
Since The Standards and the concept of assessment literacy are central to this study, it is 
imperative that the term be defined or otherwise described here. Assessment literacy has 
been defined as “the possession of knowledge about the basic principles of sound 
assessment practice, including terminology, the development and use of assessment 
methodologies and techniques, familiarity with standards of quality in assessment ... and
familiarity with alternative to traditional measurements of learning” (Paterno, 2001). An
alternative, simpler definition is offered by the North Central Regional Educational 
Laboratory: “the readiness of an educator to design, implement, and discuss assessment 
strategies” (n.d.). 

Others have chosen not to formally define assessment literacy, but rather to describe
the characteristics of those who possess it. One such characterization is as follows:

Assessment literate educators recognize sound assessment, evaluation, 
communication practices; they 

• understand which assessment methods to use to gather dependable 
information and student achievement. 

• communicate assessment results effectively, whether using report card 
grades, test scores, portfolios, or conferences. 

• can use assessment to maximize student motivation and learning by 
involving students as full partners in assessment, record keeping, and 
communication (Center for School Improvement and Policy Studies,
Boise State University, n.d.).

Another similar description is provided by Stiggins (1995), who states that “Assessment 
literates know the difference between sound and unsound assessment. They are not 
intimidated by the sometimes mysterious and always daunting technical world of 
assessment” (p. 240). He continues by stating that assessment-literate educators
(regardless of whether they are teachers, administrators, or superintendents) enter the 
realm of assessment knowing what they are assessing, why they are doing it, how best to 
assess the skill/knowledge of interest, how to generate good examples of student 
performance, what can potentially go wrong with the assessment, and how to prevent that 
from happening. They are also aware of the potential negative consequences of poor, 
inaccurate assessment (Stiggins, 1995). 

Research on Assessment Literacy and “The Standards”

Numerous research studies have been conducted over the past 10 years that have 
addressed one or more of the seven Standards (Brookhart, 2001). However, only one
(Plake, 1993) has addressed all teacher competencies, as specified by The
Standards, for inservice teachers. Additionally, one other study (Campbell, Murphy, &
Holt, 2002) has attempted to apply The Standards to groups of undergraduate preservice
teachers. Finally, one other study attempted to integrate The Standards into a graduate- 
level course through the use of performance assessments (O’Sullivan & Johnson, 1993). 

In 1991, a national study was undertaken in order to measure teachers’ assessment 
literacy (Plake, 1993). The Standards were used as a test blueprint for the development of 
the survey instrument used in the study. The survey instrument (the Teacher Assessment
Literacy Questionnaire) consisted of 35 items (5 per standard). Items were developed as
application-type questions that were realistic and meaningful to teachers' actual
practices. The instrument went through extensive content validation and pilot testing.
A representative sample from around the country was selected to participate; a total of
98 districts in 45 states
participated, with a total usable sample of 555 surveys (Plake, 1993). The KR-20 
reliability for the entire test was equal to .54 (Plake, Impara, & Fager, 1993). 
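
For readers unfamiliar with the KR-20 coefficient reported above, the following is a minimal sketch of how it can be computed for a test scored right/wrong. It is an illustration only: the function name `kr20` and the toy response matrix are hypothetical and are not data from any of the studies cited, and the sketch assumes the population form of the total-score variance, which is one common convention.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomous (0/1) item scores.

    responses: list of examinee rows, each a list of 0/1 item scores.
    KR-20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / variance of total scores)
    Assumption: population variance is used for the total scores.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    if n_items < 2 or total_var == 0:
        raise ValueError("KR-20 needs at least 2 items and nonzero score variance")

    # Sum of item variances p * q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical toy data: 5 examinees by 4 items (not from the study).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2))
```

Because the items are scored dichotomously, this coefficient is algebraically the same quantity as coefficient alpha computed on the same 0/1 data, which is why the two statistics can be compared across studies.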

Teachers answered an average of slightly more than 23 out of 35 items correctly. The
teachers’ highest performance occurred on Standard 3 (Administering, Scoring, and
Interpreting the Results of Assessments; M = 3.96/5.00); the lowest performance occurred
on Standard 6 (Communicating Assessment Results; M = 2.70/5.00). On 10 of the 35
items, 90% or more of teachers answered the item correctly. These items addressed issues 
including selecting appropriate assessments, acceptable test taking behavior for 
standardized testing situations, explanation of the basis for a grade to a child's parent,
and the recognition of unethical practices in standardized test administration. On 5 items,
fewer than 30% of teachers answered correctly. Two of the five came from Standard 5
(Developing Valid Grading Procedures). Only 13% answered correctly an item that focused
on steps to increase reliability of a test score. The two remaining items with low
performance addressed Standard 7 (Recognizing Unethical or Illegal Practices).

A similar study, conducted by Campbell et al. (2002), attempted to apply the 
identical previously described assessment literacy instrument to undergraduate preservice 
teachers. The renamed Assessment Literacy Inventory (ALI) was administered to 220
undergraduate students following a course in tests and measurement. The course included
topics such as creating and critiquing various methods of assessment, discussing ethical
considerations related to assessment, interpreting and communicating both classroom and 
standardized assessment results, and discussing and evaluating psychometric qualities 
(i.e., validity and reliability) of assessments. 

The data from the undergraduate preservice teachers exhibited a higher level of
reliability (α = .74) than their inservice counterparts in the Plake et al. study (Campbell,
Murphy, & Holt, 2002). The preservice teachers (M = 21) averaged two fewer questions
answered correctly than did the inservice teachers (M = 23). Six items (numbers 5, 7, 22,
28, 31, and 35) demonstrated poor item discrimination values (< .20). The inservice
teachers in the Plake et al. study scored higher than the preservice teachers on all but one 
standard (Standard 1 —Choosing Appropriate Assessment Methods). The preservice 
teachers scored highest on Standard 1 , whereas the inservice teachers scored highest on 
Standard 3. Both groups of teachers scored lowest on Standard 6 (Communicating
Assessment Results).

Finally, a third study attempted to integrate The Standards into a graduate-level 
course in measurement. O’Sullivan and Johnson (1993) designed a course that 
incorporated performance assessments (N = 8) which were aligned with The Standards. 
Teachers were pretested (M = 24.2) and posttested (M = 27.3) using the Plake et al.
instrument. The results indicated a slight improvement in assessment literacy over the 
duration of the course. 

Purpose of the Study 

It was the intent of this study to investigate the concept of “assessment literacy” and 
attempt to measure it as delineated by The Standards for Teacher Competence in the 
Educational Assessment of Students. Specifically, the purposes of this study were: (1) to 
measure and describe the relative levels of assessment literacy for both preservice and 
inservice teachers, and (2) to statistically compare the relative levels of assessment 
literacy for these two groups. This is the first study that attempts to measure assessment 
literacy for both preservice and inservice teachers and statistically compare the results. 

The research questions addressed in the study were: 

Research Question 1: What is the level of assessment literacy, as measured by the
Classroom Assessment Literacy Inventory, for preservice teachers?

Research Question 2: What is the level of assessment literacy, as measured by the
Classroom Assessment Literacy Inventory, for inservice teachers?

Research Question 3: How does the assessment literacy of preservice teachers compare
to that of inservice teachers? Are there any significant differences between the two 
groups? 

Methods 

Participants 

During the fall of 2002, the researcher surveyed both preservice and inservice 
teachers with respect to their assessment literacy. The group of preservice teachers
consisted of 67 undergraduate students, all majoring in secondary education, at a
midwestern university. At the time of data collection, they were enrolled in methods
courses (i.e., the term preceding student teaching) and had just completed a course in 
classroom assessment. The group of inservice teachers consisted of 197 teachers, 
representing nearly every district and school in a three-county area surrounding the same 
institution. The schools were selected based on convenience due to their geographic 
location. All grade levels and content areas were represented in the final sample.

Instrumentation

Both groups of teachers were surveyed using an instrument titled the Classroom 
Assessment Literacy Inventory, or CALI, which was adapted from a similar instrument 
called the Teacher Assessment Literacy Questionnaire (Plake, 1993; Plake, Impara, & 
Fager, 1993). This inventory is based on the Standards for Teacher Competence in the 
Educational Assessment of Students (AFT, NCME, & NEA, 1990). The CALI consisted 
of the same 35 content-based items (five per standard) with a limited amount of 
rewording (e.g., changing some names of fictitious teachers, changing word choice to 
improve clarity, etc.), as well as 7 demographic items. 

The original instrument has been shown to have reasonable reliability with both
inservice teachers, KR-20 = .54 (Plake, Impara, & Fager, 1993), and preservice teachers,
α = .74 (Campbell, Murphy, & Holt, 2002). Furthermore, the original instrument was
subjected to a thorough content validation, including reviews by members of the National 
Council on Measurement in Education and a pilot study with and feedback from 
practicing teachers and administrators. 

Procedures 

Inservice teachers were sent the CALI in both paper and Web-based formats. Two 
weeks after the initial mailing of the paper version and posting of the Web-based version, 
teachers were sent a reminder about completing the instrument. The instrument was 
administered to the preservice teachers at the final class meeting in their classroom 
assessment course. They were informed that their individual decision about participation, 
as well as their individual score on the instrument, would in no way affect the grade 
received for the course. 

Analyses 

Descriptive analyses at the individual item level included frequencies and reliability 
analyses; descriptive analyses were also conducted for the seven composite scores (i.e., 
based on The Standards). Inferential analyses included t-test comparisons (evaluated at
an α level of .05) of the preservice and inservice teacher mean scores for each of the
seven composite scores, as well as the total score for the entire instrument. All analyses
were conducted using SPSS (v. 11). 
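
As an illustration of the kind of comparison described here, the sketch below computes a pooled-variance independent-samples t statistic directly from group summary statistics. It is not the SPSS procedure used in the study; the function name `pooled_t` is hypothetical, and equal variances are assumed because the reported degrees of freedom (262 = 197 + 67 - 2) are consistent with that form of the test. The inputs are the Standard 3 means, standard deviations, and group sizes reported in the Results.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic from summary statistics.

    Assumption: equal population variances (pooled-variance t-test).
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Standard 3: inservice (M = 3.95, SD = .95, n = 197)
# versus preservice (M = 3.24, SD = 1.00, n = 67).
t_stat, df = pooled_t(3.95, 0.95, 197, 3.24, 1.00, 67)
print(f"t({df}) = {t_stat:.2f}")
```

Working from the rounded means and standard deviations gives a value of roughly 5.2, close to but not identical to the reported statistic, which is expected when the original analysis used unrounded data. Obtaining a p value would additionally require the t distribution, which is omitted from this sketch.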

Results

The results that follow are presented by each individual research question. 

Research Question 1: What is the level of assessment literacy, as measured by the
Classroom Assessment Literacy Inventory, for preservice teachers?

The data resulting from the preservice teacher group (N = 67) demonstrated a
reasonably good level of internal consistency reliability, α = .74. On average, preservice
teachers answered slightly fewer than 19 out of 35 items correctly. Out of the seven
competency areas, as delineated by The Standards, the highest overall performance for
preservice teachers was found for Standard 1 (Choosing Appropriate Assessment
Methods; M = 3.25, maximum possible score = 5). The lowest performance was found
for Standard 5 (Developing Valid Grading Procedures; M = 2.06). The results for the
preservice teachers for each of the seven standards are presented in Table 1.



Insert Table 1 here 

On only 4 of the 35 items did 90% or more of the preservice teachers answer the
item correctly. One item each came from Standard 1 (Choosing Appropriate Assessment
Methods) and Standard 2 (Developing Appropriate Assessment Methods); two items came
from Standard 3 (Administering, Scoring, and Interpreting the Results of Assessments).

On five of the 35 items, 25% or fewer answered the item correctly. One item came
from Standard 2 (Developing Appropriate Assessment Methods); two items each came
from Standard 5 (Developing Valid Grading Procedures) and Standard 7 (Recognizing
Unethical or Illegal Practices).

Research Question 2: What is the level of assessment literacy, as measured by the
Classroom Assessment Literacy Inventory, for inservice teachers?

The data resulting from the inservice teacher group (N = 197) demonstrated a
mediocre level of internal consistency reliability, α = .57. On average, inservice teachers
answered slightly fewer than 22 out of 35 items correctly. Out of the seven competency
areas, the highest overall performance for inservice teachers was found for Standard
3 (Administering, Scoring, and Interpreting the Results of Assessments; M = 3.95,
maximum possible score = 5). The lowest performance was found for Standard
5 (Developing Valid Grading Procedures; M = 2.06). The results for the inservice
teachers for each of the seven standards are presented in Table 2.



Insert Table 2 here 



On 8 of the 35 items, 90% or more of the inservice teachers answered the item 
correctly. Two items each came from Standard 1 —Choosing Appropriate Assessment 
Methods, Standard 2— Developing Appropriate Assessment Methods, Standard 
3 —Administering, Scoring, and Interpreting the Results of Assessments, and Standard 
7 —Recognizing Unethical or Illegal Practices. 

On six of the 35 items, 25% or fewer answered the item correctly. One item came
from Standard 2 (Developing Appropriate Assessment Methods); three items came from
Standard 5 (Developing Valid Grading Procedures); and two items came from Standard
7 (Recognizing Unethical or Illegal Practices).

Research Question 3: How does the assessment literacy of preservice teachers
compare to that of inservice teachers? Are there any significant differences 
between the two groups? 

Standard and total scores for the two groups of teachers were compared by
conducting independent-samples t-tests (α = .05). Examination of the results revealed
that significant differences existed between the two groups for scores on 5 of the 7
Standards, as well as for the total scores. In all cases where there were significant
differences, the inservice teachers scored significantly higher (i.e., they were more
assessment literate) than their preservice counterparts. The largest discrepancies were
found for Standard 3, the total score, and Standard 4, respectively. For Standard 3, the
inservice teachers scored significantly higher (M = 3.95, SD = .95) than the preservice
teachers (M = 3.24, SD = 1.00), t(262) = 5.23, p < .05, two-tailed. For the total score, the
inservice teachers scored significantly higher (M = 21.96, SD = 3.44) than the preservice
teachers (M = 18.96, SD = 4.65), t(262) = 4.85, p < .05, two-tailed. For Standard 4, once
again the inservice teachers scored significantly higher (M = 3.36, SD = 1.08) than the
preservice teachers (M = 2.67, SD = 1.19), t(262) = 4.36, p < .05, two-tailed. Significant
differences were also found for Standards 1, 2, and 7. There were no significant
differences found between the groups for Standards 5 and 6. Interestingly, both groups
performed the poorest, and at exactly the same level, on Standard 5. The results of all
t-tests are presented in Table 3.



Insert Table 3 here 

Discussion

Many of the results of this study parallel those of an earlier study (Plake, 1993; 
Plake, Impara, & Fager, 1993) that used the original version of the instrument and 
focused on the assessment literacy of inservice teachers. With respect to overall 
performance on the 35 items, the average score was equal to 22 items answered
correctly, quite similar to the average score of 23 obtained by Plake (1993). In the
earlier study, the highest mean performance for a given competency area was on Standard
3 (Administering, Scoring, and Interpreting the Results of Assessments); the lowest
performance was on Standard 6 (Communicating Assessment Results). In the present
study, the highest mean performance was also on Standard 3; the lowest was on Standard
5 (Developing Valid Grading Procedures). Reliability analyses also revealed similar
values for internal consistency (α = .54 and .57 for the original study and the study at
hand, respectively).

The results for the preservice teachers also reflected those from a recent study, 
which also used the original instrument but collected data from preservice teachers 
(Campbell, Murphy, & Holt, 2002). In that study, the highest mean performance was on
Standard 1 (Choosing Appropriate Assessment Methods); the lowest performance was on
Standard 6 (Communicating Assessment Results). In the present study, the highest mean
performance was also on Standard 1; the lowest was on Standard 5 (Developing Valid
Grading Procedures). Reliability analyses revealed identical values for internal
consistency (α = .74 for both the original study and the study at hand).

Comparisons between preservice and inservice teachers of the seven competency 
area scores revealed significant differences on five of the seven areas, as well as on the
total scores. In all cases where significant differences were found, the inservice teachers
scored higher than their preservice counterparts. Both groups demonstrated their poorest 
performance on Standard 5 —Developing Valid Grading Procedures, followed closely by 
Standard 6— Communicating Assessment Results. 

Research has shown that traditional teacher preparation courses in classroom 
assessment are not well matched with what teachers need to know for classroom practice 
(Schafer, 1993). The traditional focus has historically been on large-scale (standardized) 
testing (Schafer, 1993), although this trend is changing. One course in assessment and 
measurement may truly be insufficient to cover everything teachers need to know. 

This is even more troubling given that many teacher
preparation institutions and states do not even require a course in assessment (Campbell,
Murphy, & Holt, 2002; Schafer, 1993). As of January 1998, only 15 states had teacher
certification standards that required competence in assessment, and 10 states explicitly 
required a course in assessment; however, 25 states held no expectation of competence in 
assessment (Stiggins, 1999b). The majority of states and institutions simply embed 
assessment content into other teacher education coursework; students then learn about 
assessment and measurement from instructors who typically possess no expertise in 
educational assessment (Quilter, 1999). 

However, instruction from individuals with expertise in educational assessment
may not be enough. What may matter more is not that experts present the instruction,
but that these measurement specialists better understand the reality of K-12
classrooms. Specifically, it is important that they understand that assessment is an
integral component of instruction and goals for student learning (McMillan, 2001; 
Pilcher, 2001). Teachers have indicated that they are more concerned with the day-to-day
issues related to the application of assessment processes and less with fundamental 
measurement principles (Rogers, 1991). Hopefully, then, those who teach courses in 
assessment and measurement can teach preservice teachers to see this vital connection 
between assessment and instruction, making assessment more applicable to their views of 
teaching. 

With respect to the concept of assessment literacy, Popham (2003) has called for an 
increased effort among the measurement community at large to promote assessment 
literacy on the part of policymakers, practitioners (including teachers, administrators, and
counselors), the public, and parents. A more assessment-literate citizenry is less likely to
tolerate misuse of assessment and, specifically, assessment results. Stiggins (1995) offers 
several guiding principles for educators to follow in order to promote assessment literacy. 
These guiding principles suggest that educators should: 

• start with a clear purpose for assessment, 

• focus on achievement targets, 

• select appropriate assessment methods, 

• adequately sample student achievement, and 

• avoid bias and distortion. 

Stiggins (1995) continues by stating that these standards of assessment quality are not 
negotiable, nor is the expectation that they be met every time educators assess student 
achievement. However, research shows that these standards are seldom met, due to fear
of assessment and evaluation, insufficient time to assess properly, or public perceptions

The day-to-day work of classroom teachers is multifaceted, to say the least.
However, none of these daily responsibilities is more important, or more central, to the
work of teachers than assessing student performance (Mertler, 2003). Previous
studies have reported that teachers feel, and actually are, unprepared to adequately
assess their students (e.g., Mertler, 1998, 1999; Plake, 1993). They often believe that they
did not receive sufficient training in their undergraduate preparation programs to
feel comfortable with their skills in making assessment decisions. This, coupled with
the fact that inservice teachers outscored preservice teachers on nearly every subscale in
this study, may raise substantial questions about the usefulness (or, perhaps more
importantly, the appropriateness) of assessment training in preservice teacher education
programs.

Another question worthy of consideration, and of further research, is whether
most assessment training is effectively “on-the-job” training; in other words, are
assessment skills best learned through classroom experience as a teacher, perhaps once
teachers can place the notion of “assessment” in a specific context, as opposed to learning
them as an undergraduate? Does undergraduate training provide the necessary foundation
for this on-the-job training? At a minimum, the present study highlights specific
competency areas (namely, developing valid grading procedures and communicating
assessment results) where both preservice and inservice teachers need remediation and
additional support. Additionally, the measurement community must take on the
responsibility of improving assessment literacy among all educational stakeholders.

Table 1

Means and Standard Deviations for Preservice Teachers by Standard and Total Scores
on Classroom Assessment Literacy Inventory

Standard                                                                         Mean   Standard Deviation
Standard 1: Choosing Appropriate Assessment Methods                              3.25   1.03
Standard 2: Developing Appropriate Assessment Methods                            2.78    .83
Standard 3: Administering, Scoring, and Interpreting the Results of Assessments  3.24   1.00
Standard 4: Using Assessment Results to Make Decisions                           2.67   1.20
Standard 5: Developing Valid Grading Procedures                                  2.06    .95
Standard 6: Communicating Assessment Results                                     2.27   1.32
Standard 7: Recognizing Unethical or Illegal Practices                           2.69   1.13
Total Score                                                                     18.96   4.65

Note: N = 61

Table 2

Means and Standard Deviations for Inservice Teachers by Standard and Total Scores on
Classroom Assessment Literacy Inventory

Standard                                                                         Mean   Standard Deviation
Standard 1: Choosing Appropriate Assessment Methods                              3.74    .86
Standard 2: Developing Appropriate Assessment Methods                            3.18    .89
Standard 3: Administering, Scoring, and Interpreting the Results of Assessments  3.95    .95
Standard 4: Using Assessment Results to Make Decisions                           3.36   1.08
Standard 5: Developing Valid Grading Procedures                                  2.06    .85
Standard 6: Communicating Assessment Results                                     2.57   1.23
Standard 7: Recognizing Unethical or Illegal Practices                           3.10    .81
Total Score                                                                     21.96   3.44

Note: N = 197

Table 3

t-Test Results for Comparisons of Scores for Preservice and Inservice Teachers

Standard      Group         Mean    t-statistic   p-value
Standard 1    Preservice    3.25       3.79*      <.001
              Inservice     3.74
Standard 2    Preservice    2.78       3.28*       .001
              Inservice     3.18
Standard 3    Preservice    3.24       5.23*      <.001
              Inservice     3.95
Standard 4    Preservice    2.67       4.36*      <.001
              Inservice     3.36
Standard 5    Preservice    2.06       -.03        .975
              Inservice     2.06
Standard 6    Preservice    2.27       1.69        .093
              Inservice     2.57
Standard 7    Preservice    2.69       2.77*       .007
              Inservice     3.10
Total Score   Preservice   18.96       4.85*      <.001
              Inservice    21.96

* p < .01.
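
As an illustration of how the comparisons in Table 3 relate to the summary statistics in Tables 1 and 2, the following minimal sketch computes an independent-samples t statistic from reported means, standard deviations, and group sizes. It assumes a pooled-variance (equal variances) test and assumes that all 61 preservice and 197 inservice respondents contributed to each comparison; neither detail is stated in the tables, and the function name `pooled_t` is hypothetical. Because the published means and standard deviations are rounded, the result should only approximate the tabled value.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic using a pooled variance estimate.

    Returns the t statistic (group 2 minus group 1) and its degrees of freedom.
    """
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se, df


# Standard 1 summary statistics as reported in Tables 1 and 2
# (preservice: M = 3.25, SD = 1.03, N = 61; inservice: M = 3.74, SD = .86, N = 197)
t_stat, df = pooled_t(3.25, 1.03, 61, 3.74, 0.86, 197)
print(f"t({df}) = {t_stat:.2f}")
```

Under these assumptions the sketch gives a value of roughly 3.70 for Standard 1, close to but not identical with the 3.79 reported in Table 3. The gap may reflect rounding in the published summary statistics or details of the original analysis that are not reported here.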